uml class model
PathOCL: Path-Based Prompt Augmentation for OCL Generation with GPT-4
Abukhalaf, Seif | Hamdaqa, Mohammad | Khomh, Foutse
The rapid progress of AI-powered programming assistants, such as GitHub Copilot, has facilitated the development of software applications. These assistants rely on large language models (LLMs), which are foundation models (FMs) that support a wide range of tasks related to understanding and generating language. LLMs have demonstrated their ability to express UML model specifications using formal languages like the Object Constraint Language (OCL). However, the context size of the prompt is limited by the number of tokens an LLM can process. This limitation becomes significant as the size of UML class models increases. In this study, we introduce PathOCL, a novel path-based prompt augmentation technique designed to facilitate OCL generation. PathOCL addresses the limitations of LLMs, specifically their token processing limit and the challenges posed by large UML class models. PathOCL is based on the concept of chunking, which selectively augments the prompts with a subset of UML classes relevant to the English specification. Our findings demonstrate that PathOCL, compared to augmenting the complete UML class model (UML-Augmentation), generates a higher number of valid and correct OCL constraints using the GPT-4 model. Moreover, the average prompt size crafted using PathOCL significantly decreases when scaling the size of the UML class models.
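The chunking idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: the toy class model, the naive word matching, and the helper names (`mentioned_classes`, `shortest_path`, `select_chunk`) are all assumptions made for illustration. The sketch finds the classes named in an English specification, then keeps only the classes that lie on association paths between them, so that unrelated classes never reach the prompt.

```python
from collections import deque

# Hypothetical toy class model: class name -> directly associated class names.
ASSOCIATIONS = {
    "Customer": ["Order"],
    "Order": ["Customer", "OrderLine"],
    "OrderLine": ["Order", "Product"],
    "Product": ["OrderLine"],
    "Warehouse": ["Shipment"],
    "Shipment": ["Warehouse"],
}


def mentioned_classes(spec, model):
    """Return class names that occur as whole words in the specification.

    Assumption: naive, case-insensitive word matching stands in for whatever
    matching a real tool would use.
    """
    words = {w.strip(".,;:").lower() for w in spec.split()}
    return sorted(c for c in model if c.lower() in words)


def shortest_path(model, start, goal):
    """Breadth-first search over associations; returns a class list or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in model.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


def select_chunk(spec, model):
    """Collect the classes on paths between every pair of mentioned classes."""
    found = mentioned_classes(spec, model)
    chunk = set(found)
    for i, a in enumerate(found):
        for b in found[i + 1:]:
            path = shortest_path(model, a, b)
            if path:
                chunk.update(path)
    return sorted(chunk)


spec = "Every customer may only order a product that is in stock."
chunk = select_chunk(spec, ASSOCIATIONS)
print(chunk)
```

In this toy model, `Warehouse` and `Shipment` are not connected to any class named in the specification, so they are left out of the chunk; only the selected classes would then be serialized into the prompt.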
Addressing Semantic Ambiguities in Natural Language Constraints
Bajwa, Imran Sarwar (University of Birmingham) | Lee, Mark (University of Birmingham) | Bordbar, Behzad (University of Birmingham) | Ali, Ahsan (Queens Academic Group)
In the NL2OCL project, we aim to translate English specifications of constraints into formal constraints such as OCL (Object Constraint Language). Our contribution to English-to-OCL translation is a semantic analyzer that uses the output of the Stanford parser for shallow and deep semantic parsing. Our analysis of the output of shallow semantic parsing showed that semantic roles were misidentified for a few English constraints due to semantic ambiguity. Similarly, in deep semantic parsing, it is difficult to resolve the scope of quantifier operators due to scope ambiguity, which is another subtype of semantic ambiguity. In this paper, we highlight the identified cases of semantic ambiguity in English constraints and present a novel approach to resolve them automatically. We also evaluate the approach to show that addressing these cases of semantic ambiguity yields more accurate and complete formal (OCL) specifications.
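As a hedged illustration of the scope ambiguity mentioned above, consider the sentence "Every student is assigned a supervisor." The sentence and the predicate names are assumptions chosen for illustration and are not taken from the paper. Depending on which quantifier takes wide scope, the sentence admits two logical readings:

```latex
% Reading 1: each student has some supervisor, possibly a different one per student.
\[
\forall s \,\bigl(\mathit{Student}(s) \rightarrow \exists v \,(\mathit{Supervisor}(v) \land \mathit{assigned}(s, v))\bigr)
\]
% Reading 2: one single supervisor is assigned to all students.
\[
\exists v \,\bigl(\mathit{Supervisor}(v) \land \forall s \,(\mathit{Student}(s) \rightarrow \mathit{assigned}(s, v))\bigr)
\]
```

The two readings would lead to different formal constraints, which is why the quantifier scope has to be resolved before a constraint is generated.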
Semantic Analysis of English Specification of OCL
Bajwa, Imran Sarwar (University of Birmingham) | Lee, Mark (University of Birmingham) | Bordbar, Behzad (University of Birmingham)
In this paper, we present a novel approach, NL2OCL, for translating English specifications of constraints into formal constraints such as OCL (Object Constraint Language). In this approach, input English constraints are syntactically and semantically analyzed to generate an SBVR (Semantics of Business Vocabulary and Rules) based logical representation that is finally mapped to OCL. During the syntactic and semantic analysis, we also address various syntactic and semantic ambiguities, which makes the presented approach robust. The approach is implemented in Java as a proof of concept. We also solve a case study with our tool to evaluate the accuracy of the approach. The evaluation results are compared with those of the pattern-based approach to highlight the significance of our approach.
SBVR Business Rules Generation from Natural Language Specification
Bajwa, Imran Sarwar (University of Birmingham) | Lee, Mark G. (University of Birmingham) | Bordbar, Behzad (University of Birmingham)
In this paper, we present a novel approach for translating natural language specifications into SBVR business rules. Business rules constrain the structure of a business or control the behaviour of a business process. In modern business modelling, one of the important phases is writing business rules. Typically, a business rule analyst has to manually write hundreds of business rules in a natural language (NL) and then manually translate the NL specification of every rule into a particular rule language such as SBVR or OCL, as required. However, manually translating an NL rule specification into a formal representation such as an SBVR rule is not only difficult, complex and time consuming but can also result in erroneous business rules. We propose an approach that automatically translates the NL (such as English) specification of business rules into SBVR (Semantics of Business Vocabulary and Rules) rules. The major challenge in NL to SBVR translation was the complex semantic analysis of the English language. We use a rule-based algorithm for robust semantic analysis of English and to generate SBVR rules. Automated generation of SBVR-based business rules can help constrain business aspects more accurately and efficiently in typical business modelling.